FastJoin, an improved neighbor-joining algorithm.
Authors
Abstract
Reconstructing the evolutionary history of a set of species is a fundamental problem in biology, and methods for solving it are evaluated on two characteristics: accuracy and efficiency. Neighbor-joining reconstructs phylogenetic trees by iteratively picking a pair of nodes to merge into a new node until only one node remains; owing to its good accuracy and speed, it has been embraced by the phylogeny research community. With the advent of large amounts of data, faster and more precise methods for reconstructing evolutionary trees have become necessary. We improved the neighbor-joining algorithm by iteratively picking two pairs of nodes and merging them into two new nodes until only one node remains. For a purely additive tree, we found that in each iteration of the clustering procedure a second pair of true neighbors can be merged into a new node, in addition to the pair of true neighbors chosen by the neighbor-joining criterion. These new neighbors would also be selected by another iteration of the neighbor-joining method, so the improved algorithm, which merges two pairs of nodes per iteration, constructs the same phylogenetic tree as the neighbor-joining algorithm for the same input data. By combining the improved neighbor-joining algorithm with the upper-bound computation optimization of RapidNJ and the external-storage approach of ERapidNJ, we propose a new method of reconstructing phylogenetic trees, FastJoin. Experiments on data sets showed that the new algorithm yields a significant speed-up over classic neighbor-joining and, empirically, that FastJoin is superior to almost all other neighbor-joining implementations.
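To make the baseline concrete, here is a minimal Python sketch of classic neighbor-joining, the procedure that FastJoin builds on. It is not the FastJoin implementation. The selection criterion Q(i, j) = (n - 2) d(i, j) - r(i) - r(j) and the distance update d(u, k) = (d(a, k) + d(b, k) - d(a, b)) / 2 are the textbook neighbor-joining formulas rather than details taken from the abstract. The function name `neighbor_joining`, the labels `node0`, `node1`, ... and `root`, and the five-leaf additive distance matrix are illustrative assumptions.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Classic neighbor-joining on a symmetric distance matrix.

    names: list of leaf labels; dist[i][j]: distance between names[i] and names[j].
    Returns the merges as (node_a, node_b, new_node) tuples, in the order performed.
    """
    # Dict-of-dicts so nodes can be added and retired by label.
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    active = list(names)
    merges = []
    next_id = 0
    while len(active) > 2:
        n = len(active)
        # Row sums r(i) over the currently active nodes.
        r = {a: sum(d[a][b] for b in active) for a in active}
        # Pick the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        # Ties are broken by enumeration order.
        a, b = min(
            combinations(active, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        new = f"node{next_id}"  # hypothetical label for the merged node
        next_id += 1
        # Distances from the merged node to every remaining node.
        d[new] = {new: 0.0}
        for k in active:
            if k not in (a, b):
                d_new_k = 0.5 * (d[a][k] + d[b][k] - d[a][b])
                d[new][k] = d_new_k
                d[k][new] = d_new_k
        active = [k for k in active if k not in (a, b)] + [new]
        merges.append((a, b, new))
    # Join the last two nodes so that only one node remains.
    merges.append((active[0], active[1], "root"))
    return merges


# Illustrative additive distances on five leaves (made up for this sketch).
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
merges = neighbor_joining(names, dist)
for step in merges:
    print(step)
```

On this matrix, after the first merge of `a` and `b`, two disjoint pairs tie for the minimum criterion value: `c` with the merged node, and `d` with `e`. This is the kind of situation in which a second pair could be merged in the same iteration. The abstract does not state the exact rule FastJoin uses to pick its second pair, so the sketch deliberately merges only one pair per iteration.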
Similar resources
A fast neighbor joining method.
With the rapid development of sequencing technologies, an increasing number of sequences are available for evolutionary tree reconstruction. Although neighbor joining is regarded as the most popular and fastest evolutionary tree reconstruction method [its time complexity is O(n^3), where n is the number of sequences], it is not sufficiently fast to infer evolutionary trees containing more than...
Neighbor joining with phylogenetic diversity estimates
The Neighbor-Joining algorithm is a recursive procedure for reconstructing trees that is based on a transformation of pairwise distances between leaves. We present a generalization of the neighbor-joining transformation, which uses estimates of phylogenetic diversity rather than pairwise distances in the tree. This leads to an improved neighbor-joining algorithm whose total running time is still...
Beyond pairwise distances: neighbor-joining with phylogenetic diversity estimates.
The "neighbor-joining algorithm" is a recursive procedure for reconstructing trees that is based on a transformation of pairwise distances between leaves. We present a generalization of the neighbor-joining transformation, which uses estimates of phylogenetic diversity rather than pairwise distances in the tree. This leads to an improved neighbor-joining algorithm whose total running time is st...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
Journal title:
- Genetics and molecular research : GMR
Volume 11, Issue 3
Pages -
Publication date 2012